test(amber): move state-mat e2e into amber-integration#5682
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5682 +/- ##
============================================
- Coverage 53.09% 52.90% -0.20%
- Complexity 2546 2625 +79
============================================
Files 1076 1090 +14
Lines 41762 42208 +446
Branches 4513 4531 +18
============================================
+ Hits 22174 22329 +155
- Misses 18278 18569 +291
Partials 1310 1310
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 434 | 0.265 | 22,101/34,168/34,168 us | 🔴 +11.2% / 🟢 -7.1% |
| 🟢 | bs=100 sw=10 sl=64 | 938 | 0.573 | 105,323/126,264/126,264 us | 🟢 -12.3% / 🟢 -9.7% |
| ⚪ | bs=1000 sw=10 sl=64 | 1,103 | 0.673 | 908,775/940,376/940,376 us | ⚪ within ±5% / 🟢 -8.1% |
Baseline details
Latest main 6becb85 from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 434 tuples/sec | 439 tuples/sec | 410.82 tuples/sec | -1.1% | +5.6% |
| bs=10 sw=10 sl=64 | MB/s | 0.265 MB/s | 0.268 MB/s | 0.251 MB/s | -1.1% | +5.7% |
| bs=10 sw=10 sl=64 | p50 | 22,101 us | 21,754 us | 23,785 us | +1.6% | -7.1% |
| bs=10 sw=10 sl=64 | p95 | 34,168 us | 30,738 us | 34,980 us | +11.2% | -2.3% |
| bs=10 sw=10 sl=64 | p99 | 34,168 us | 30,738 us | 34,980 us | +11.2% | -2.3% |
| bs=100 sw=10 sl=64 | throughput | 938 tuples/sec | 973 tuples/sec | 891.94 tuples/sec | -3.6% | +5.2% |
| bs=100 sw=10 sl=64 | MB/s | 0.573 MB/s | 0.594 MB/s | 0.544 MB/s | -3.5% | +5.3% |
| bs=100 sw=10 sl=64 | p50 | 105,323 us | 100,392 us | 112,277 us | +4.9% | -6.2% |
| bs=100 sw=10 sl=64 | p95 | 126,264 us | 144,023 us | 139,802 us | -12.3% | -9.7% |
| bs=100 sw=10 sl=64 | p99 | 126,264 us | 144,023 us | 139,802 us | -12.3% | -9.7% |
| bs=1000 sw=10 sl=64 | throughput | 1,103 tuples/sec | 1,110 tuples/sec | 1,041 tuples/sec | -0.6% | +6.0% |
| bs=1000 sw=10 sl=64 | MB/s | 0.673 MB/s | 0.678 MB/s | 0.635 MB/s | -0.7% | +5.9% |
| bs=1000 sw=10 sl=64 | p50 | 908,775 us | 899,717 us | 972,714 us | +1.0% | -6.6% |
| bs=1000 sw=10 sl=64 | p95 | 940,376 us | 979,146 us | 1,023,057 us | -4.0% | -8.1% |
| bs=1000 sw=10 sl=64 | p99 | 940,376 us | 979,146 us | 1,023,057 us | -4.0% | -8.1% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,460.72,200,128000,434,0.265,22100.89,34168.24,34168.24
1,100,10,64,20,2132.07,2000,1280000,938,0.573,105322.65,126264.45,126264.45
2,1000,10,64,20,18126.89,20000,12800000,1103,0.673,908775.49,940376.32,940376.32b9a13f2 to
9f13e26
Compare
Drop the macOS-only `pyamber-state-materialization-mac` diagnostic job and the sqlite-backed `SqlCatalog` override it relied on. Mark the two e2e tests with @pytest.mark.integration so they run in the `amber-integration` ubuntu job, which already provisions postgres + iceberg catalog DB + MinIO + Lakekeeper and runs `pytest -m integration` as its last step. Catalog initialization now mirrors test_iceberg_document.py:45 — postgres-backed JdbcCatalog with a tempdir warehouse — so we exercise the real prod catalog code path instead of an sqlite shim. The `StorageConfig.initialize` call is wrapped in a session-autouse fixture (rather than at module import) so it co-exists with test_iceberg_document.py's import-time initialize regardless of pytest collection order. The unit-style `test_process_start_channel_persists_produce_state_on_start_output` that the mac job also ran is unaffected: it monkeypatches the output manager and is already picked up by the regular `pyamber` job's `pytest -m "not integration"` step. Closes apache#5681
9f13e26 to
009d9e4
Compare
Signed-off-by: Yicong Huang <17627829+Yicong-Huang@users.noreply.github.com>
The macos-latest matrix entry was failing at "Set up job" / "docker:
command not found" because GitHub-hosted macOS runners have no
Docker and `services:` containers are Linux-only. Provision the same
stack natively on macOS:
* postgres@16 via brew, with a `postgres` superuser created so the
psql steps stay OS-agnostic;
* minio via `brew install minio`, backgrounded with nohup;
* lakekeeper from the upstream aarch64-apple-darwin tarball
(published v0.11.0 onward — same version as the Linux docker tag);
* minio-mc via brew for the warehouse bucket-init step;
* protoc release asset switched on \$RUNNER_OS.
The job-level `services.postgres` block is also removed since it's a
Linux-only feature; both branches now start postgres via a step.
Each docker-dependent step branches on \$RUNNER_OS inside its `run:`
script so the YAML keeps one step per concept rather than
Linux/macOS pairs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
protoc 3.19.4 predates upstream arm64-mac builds — the previous osx-aarch_64 URL 404'd. GitHub's Apple Silicon macOS runners ship Rosetta 2 preinstalled, so the x86_64 binary runs transparently. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aglinxinyuan
left a comment
There was a problem hiding this comment.
The change is correct. I will review it after the CI passes.
protoc's import resolver walks each -I in order. With `-I=/usr/local/include` listed first, ambermessage.proto's `import "org/apache/.../controlcommands.proto"` caused protoc to stat `/usr/local/include/org/apache/...`. On Linux that path returns ENOENT and protoc falls through to the next -I; on macOS the unzip-created /usr/local/include lacks +x for the runner user, so the stat returns EACCES and protoc aborts. Widen the existing chmod from `/usr/local/include/google` to `/usr/local/include` so the parent dir is traversable on both OSes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The macOS run dies with `protoc-gen-python_betterproto: Plugin failed with status code 1` and no plugin stderr is surfaced. Wrap the script so a failure dumps python/plugin paths, attempts to import betterproto, and reports protoc's arch — enough to tell whether the plugin is missing, mis-imported, or hitting an arch mismatch. Remove this once the macOS path is stable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rch mismatch
Diagnostics from the last run revealed:
* /usr/local/bin/protoc was Mach-O x86_64 (running under Rosetta);
* protoc-gen-python_betterproto lived in
/Users/runner/hostedtoolcache/Python/3.11.9/arm64/bin/ (arm64-only
setup-python);
* the actual betterproto package was installed into the universal
Framework Python at /Library/Frameworks/Python.framework/...
The x86_64-under-Rosetta protoc could exec the arm64 plugin shim, but
the arch/site-packages split meant the import inside the plugin landed
in a Python that didn't have betterproto wired up the same way, hence
"Plugin failed with status code 1" with no stderr surfaced.
Switching macOS to brew's arm64-native protobuf keeps the protoc /
plugin / python triad on a single arch (arm64), sidestepping all of
this. The version drifts from the pinned 3.19.4 used on Linux, but
python_betterproto's output for proto3 sources is determined by the
betterproto package version, not protoc — so the python codegen
stays bit-identical across the two paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aglinxinyuan CI looks green now. Could you please check again? |
There was a problem hiding this comment.
Pull request overview
This PR migrates the Python state-materialization end-to-end tests to run as part of the amber-integration CI job (instead of a separate macOS-only diagnostic job), and updates the test setup to exercise the production-like Iceberg catalog path.
Changes:
- Updated
test_state_materialization_e2e.pyto use a postgres-backed IcebergJdbcCatalogpath, added@pytest.mark.integration, and moved initialization into an autouse fixture. - Refactored
amber-integrationin.github/workflows/build.ymlto provision Postgres/MinIO/Lakekeeper on both Linux and macOS runners, and removed the dedicatedpyamber-state-materialization-macjob.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| amber/src/test/python/core/architecture/packaging/test_state_materialization_e2e.py | Switches e2e test from sqlite catalog override to postgres-backed catalog and marks it as integration. |
| .github/workflows/build.yml | Removes the old macOS diagnostic job and updates amber-integration infra provisioning / OS matrix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…-gen diagnostics Address Copilot review: * Replace hardcoded S3 credentials (minioadmin/us-east-1) with os.environ.get(...) reads that match the same STORAGE_S3_* env vars the production code consumes via storage.conf, defaulting to the storage.conf defaults (texera_minio/password/us-west-2). The current test payload doesn't actually push large binaries, but the previous hardcoded creds would have diverged from the CI infra the moment any code-path under test reached for S3 — better to align now. * Promote the _init_storage_config fixture from function-scope to class-scope. It force-resets the StorageConfig + IcebergCatalogInstance singletons and allocates a fresh tempdir warehouse — once per class is sufficient and avoids leaking N tempdirs / N reinitializations across the two tests in this class. Also drop the temporary diagnostics block around `bash bin/python-proto-gen.sh` in the amber-integration job. It was wired up to surface python / plugin / arch info on macOS proto-gen failures; the brew protobuf switch landed two commits ago and the macOS path is now green, so the safety net is no longer pulling its weight. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI ran `ruff format --check` and flagged the file from the previous commit (long os.environ.get(...) lines were wrapped vs joined). No semantic change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What changes were proposed in this PR?
Two things:
1. Move the state-materialization e2e tests into
amber-integration. The two e2e specs that previously ran in the deletedpyamber-state-materialization-macdiagnostic job now run under the existingamber-integrationjob, which already provisions postgres + iceberg catalog DB + MinIO + Lakekeeper and runspytest -m integrationas its last step. The test's catalog backend switches from sqliteSqlCatalogto the real postgres-backedJdbcCatalog, matchingtest_iceberg_document.py:45, so we exercise the prod catalog code path instead of an sqlite shim.2. Add macOS to the
amber-integrationmatrix. We want CI coverage that the integration stack actually runs on macOS (where most dev machines live), not just Linux. GitHub-hosted macOS runners have no Docker, so each docker-dependent step now branches on$RUNNER_OS: macOS provisions postgres / minio / lakekeeper natively (brew install+ the upstream lakekeeperaarch64-apple-darwintarball — published from v0.11.0 onward, same version as the Linux docker tag) while Linux keeps the existing docker path. Protoc on macOS uses brew's arm64-native protobuf because protoc 3.19.4 has no arm64-mac build and running its x86_64 binary under Rosetta breaks thepython_betterprotoplugin (arch / site-packages split between Rosetta'd protoc and arm64 setup-python). For proto3 sources, the plugin output depends onbetterproto, not protoc, so the protoc version drift on macOS is benign for codegen.pyamber-state-materialization-macmacOS diagnostic job (build.yml:742)SqlCatalog(sqlite) injected in module fixtureJdbcCatalog, matchingtest_iceberg_document.py:45pytest -sv <path>from that job@pytest.mark.integration+ picked up byamber-integration'spytest -m integrationamber-integrationruns only onubuntu-22.04amber-integrationruns on[ubuntu-22.04, macos-latest]with$RUNNER_OS-branched service provisioningStorageConfig.initializeis wrapped in a class-scoped autouse fixture (rather than called at module import time) so it co-exists withtest_iceberg_document.py's import-timeinitializeregardless of pytest collection order. All catalog + S3 credentials read the sameSTORAGE_*env vars the production code consumes (via storage.conf), with defaults that match storage.conf — so the test stays aligned with whichever identity the surrounding CI infra uses.The unit-style
test_process_start_channel_persists_produce_state_on_start_outputthat the deleted mac job also ran is untouched: it monkeypatches the output manager and is already picked up by the regularpyamberjob'spytest -m "not integration"step.Any related issues, documentation, discussions?
Closes #5681.
How was this PR tested?
Locally against the existing texera-dev infra (postgres on 5432 with
texera_iceberg_catalogschema initialized viasql/iceberg_postgres_catalog.sql):And the regular pyamber suite still green:
In CI, the
amber-integrationjob picks these tests up automatically because they're marked@pytest.mark.integration, on bothubuntu-22.04andmacos-latest.Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.7)